CBAS: Context Based Arabic Stemmer
Author: Nahla A Belal
Abstract
Arabic morphology encapsulates many valuable features, such as a word's root. Arabic roots are utilized for many tasks; the process of extracting a word's root is referred to as stemming. Stemming is an essential part of most Natural Language Processing tasks, especially for derivative languages such as Arabic. However, stemming faces the problem of ambiguity, where two or more roots could be extracted from the same word. Distributional semantics, on the other hand, is a powerful co-occurrence model that captures the meaning of a word based on its context. In this paper, a distributional semantics model utilizing Smoothed Pointwise Mutual Information (SPMI) is constructed to investigate its effectiveness on the stemming analysis task. It showed an accuracy of 81.5%, an improvement of at least 9.4% over other stemmers.
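The abstract does not give the exact SPMI formulation, so the following is only a minimal sketch of the general idea: score each candidate root by its smoothed pointwise mutual information with the surrounding context and keep the best one. The add-k smoothing scheme, the function names (`cooccurrence_counts`, `smoothed_pmi`, `choose_root`), the window size, and the transliterated toy corpus are all assumptions made for illustration, not the paper's method or data.

```python
import math
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count symmetric co-occurrences of tokens within a fixed window."""
    pair_counts = Counter()
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(target, tokens[j])] += 1
    return pair_counts

def smoothed_pmi(x, y, pair_counts, vocab, k=1.0):
    """PMI with add-k smoothing on every cell of the co-occurrence table.

    Assumption: add-k smoothing stands in for the paper's (unspecified) SPMI.
    """
    v = len(vocab)
    total = sum(pair_counts.values()) + k * v * v
    # Row marginals; the table is symmetric, so they also serve as column marginals.
    row = Counter()
    for (a, _), count in pair_counts.items():
        row[a] += count
    p_xy = (pair_counts[(x, y)] + k) / total
    p_x = (row[x] + k * v) / total
    p_y = (row[y] + k * v) / total
    return math.log2(p_xy / (p_x * p_y))

def choose_root(candidates, context, pair_counts, vocab, k=1.0):
    """Pick the candidate root with the highest mean smoothed PMI to the context.

    The context list must be non-empty.
    """
    def score(root):
        values = [smoothed_pmi(root, c, pair_counts, vocab, k) for c in context]
        return sum(values) / len(values)
    return max(candidates, key=score)

# Hypothetical toy corpus of transliterated roots (made up for illustration).
TOY_CORPUS = [
    ["ktb", "qlm", "drs"],
    ["drs", "ktb", "qlm"],
    ["akl", "Tbx", "shrb"],
    ["shrb", "akl", "Tbx"],
]
VOCAB = {tok for sentence in TOY_CORPUS for tok in sentence}
PAIRS = cooccurrence_counts(TOY_CORPUS)

# An ambiguous word with two candidate roots, seen next to "qlm" and "drs".
print(choose_root(["ktb", "akl"], ["qlm", "drs"], PAIRS, VOCAB))  # prints "ktb"
```

Smoothing keeps the score finite for root and context pairs that never co-occur in the corpus, which is the case for the losing candidate in this toy example.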
Similar resources
An Efficient Rank Based Arabic Root Extractor (Nahla A Belal)
A morphologically rich language such as Arabic requires deep analysis; this is due to its invaluable characteristics, which are beneficial for the task of root extraction. This paper investigates employing new techniques to enumerate and rank possible roots for a given word, using linguistic rules as scoring mechanisms. The proposed tech...
Enhancing Root Extractors Using Light Stemmers (Nahla A Belal)
The rise of Natural Language Processing (NLP) opened new possibilities for various applications that were not possible before. A morphologically rich language such as Arabic introduces a set of features, such as roots, that would assist the progress of NLP. Many tools were developed to capture the process of root extraction (stemming)...
A Random Forest Model for Mental Disorders Diagnostic Systems (Nahla A Belal)
Data mining has established new applications in medicine over the last few years. Mental disorders diagnostic systems, data possession, and data analysis have been of enormous help to clinicians in recognizing diseases more precisely, especially when dealing with overlapping mental symptoms. In this study, random ...